(54) Method and system for automatic recognition of digital indicia images deliberately distorted
to be non readable



(57) A method and system for processing mail 
pieces or substrates containing data printed thereon 
involves scanning a mail piece or substrate and obtain- 
ing information concerning the printed data. The infor- 
mation is processed to determine if the data is readable. 
Non readable data information is processed to deter- 
mine if the non readable data is due to predetermined 
causes of a first type or predetermined causes of a
second type. Substrates or mail pieces with non readable
data due to predetermined causes of the first type may
be processed in a first manner, and substrates or mail
pieces with non readable data due to predetermined
causes of the second type may be processed in a second
manner. The printing may be optical character
recognizable, bar code of any type or any other form of
printed data.
Description

The present invention relates to printing and verifying
images and, more particularly, to printing and verifying
digital indicia, such as those used for proof of
postage payment or other value printing applications.

In mail preparation, a mailer prepares a mailpiece
or a series of mailpieces for delivery to a recipient by a
carrier service such as the United States Postal Service
or other postal service or a private carrier delivery
service. The carrier service, upon receiving or accepting a
mailpiece or a series of mailpieces from a mailer,
processes the mailpiece to prepare it for physical delivery to
the recipient. Payment for the postal service or private
carrier delivery service may be made by means of value
metering devices such as postage meters. In systems
of this type, the user prints an indicia, which may be a
digital token or other evidence of payment, on the mailpiece
or on a tape that is adhered to the mailpiece. The
postage metering systems print and account for postage
and other unit value printing such as parcel delivery
service charges and tax stamps.

These postage meter systems involve both prepayment
of postal charges by the mailer (prior to postage
value imprinting) and post payment of postal charges by
the mailer (subsequent to postage value imprinting).
Prepayment meters employ descending registers for
securely storing value within the meter prior to printing,
while post payment (current account) meters employ
ascending registers to account for value imprinted. Postal
charges or other terms referring to postal or postage
meter or meter system as used herein should be understood
to mean charges for either postal charges, tax
charges, private carrier charges, tax service or private
carrier service, as the case may be, and other value
metering systems, such as the certificate metering systems
disclosed in European Patent Application of
Cordery, Lee, Pintsov, Ryan and Weiant, filed August
21, 1996, and published under No. 0762692, for
SECURE USER CERTIFICATION FOR ELECTRONIC
COMMERCE EMPLOYING VALUE METERING SYSTEM
and assigned to Pitney Bowes, Inc. Mail pieces as
used herein include both letters of all types and parcels
of all types.

Some of the varied types of postage metering systems
are shown, for example, in U.S. Patent No.
3,978,457 for MICRO COMPUTERIZED ELECTRONIC
POSTAGE METER SYSTEM, issued August 31, 1976;
U.S. Patent No. 4,301,507 for ELECTRONIC POSTAGE
METER HAVING PLURAL COMPUTING SYSTEMS,
issued November 17, 1981; and U.S. Patent No.
4,579,054 for STAND ALONE ELECTRONIC MAILING
MACHINE, issued April 1, 1986. Moreover, other
types of metering systems have been developed which
involve different printing systems such as those employing
thermal printers, ink jet printers, mechanical printers
and other types of printing technologies. Examples of
some of these other types of electronic postage meters
are described in U.S. Patent No. 4,168,533 for
MICROCOMPUTER MINIATURE POSTAGE METER, issued
September 18, 1979; and U.S. Patent No. 4,493,252 for
POSTAGE PRINTING APPARATUS HAVING A MOVABLE
PRINT HEAD AND A PRINT DRUM, issued January
15, 1985. These systems enable the postage meter to
print variable information, which may be alphanumeric
and graphic type information.

Postage metering systems have also been devel- 
oped which employ encrypted information on a mail- 
piece. The postage value for a mailpiece may be 
encrypted together with the other data to generate a 
digital token. A digital token is encrypted information 
that authenticates the information imprinted on a mail- 
piece such as postage value. Examples of postage 
metering systems which generate and employ digital 
tokens are described in U.S. Patent No. 4,757,537 for 
SYSTEM FOR DETECTING UNACCOUNTED FOR 
PRINTING IN A VALUE PRINTING SYSTEM, issued 
July 12, 1988; U.S. Patent No. 4,831,555 for SECURE
POSTAGE APPLYING SYSTEM, issued May 15, 1989;
U.S. Patent No. 4,775,246 for SYSTEM FOR DETECTING
UNACCOUNTED FOR PRINTING IN A VALUE
PRINTING SYSTEM, issued October 4, 1988; and U.S.
Patent No. 4,725,718 for POSTAGE AND MAILING
INFORMATION APPLYING SYSTEMS, issued February 16,
1988. These systems, which may utilize a device
termed a Postage Evidencing Device (PED) or Postal 
Security Device (PSD), employ an encryption algorithm 
to encrypt selected information to generate the digital 
token. The encryption of the information provides secu- 
rity to prevent altering of the printed information in a 
manner such that any change in a postal revenue block 
is detectable by appropriate verification procedures. 

Encryption systems have also been proposed 
where accounting for postage payment occurs at a time 
subsequent to the printing of the postage. Systems of 
this type are disclosed in U.S. Patent No. 4,796,193 for 
POSTAGE PAYMENT SYSTEM WHERE ACCOUNTING
FOR POSTAGE PAYMENT OCCURS AT A TIME
SUBSEQUENT TO THE PRINTING OF THE POSTAGE
AND EMPLOYING A VISUAL MARKING IMPRINTED 
ON THE MAILPIECE TO SHOW THAT ACCOUNTING 
HAS OCCURRED, issued January 3, 1989; U.S. Patent 
No. 5,293,319 for POSTAGE METERING SYSTEM, 
issued March 8, 1994; and, U.S. Patent No. 5,375,172, 
for POSTAGE PAYMENT SYSTEM EMPLOYING 
ENCRYPTION TECHNIQUES AND ACCOUNTING 
FOR POSTAGE PAYMENT AT A TIME SUBSEQUENT 
TO THE PRINTING OF THE POSTAGE, issued 
December 20, 1994. 

Other postage payment systems have been developed
not employing encryption. Such a system is
described in U.S. Patent No. 5,391,562 for SYSTEM
AND METHOD FOR PURCHASE AND APPLICATION
OF POSTAGE USING PERSONAL COMPUTER,
issued February 21, 1995. This patent describes a system
where end-user computers each include a modem
for communicating with a computer at a postal authority.
The system is operated under control of a postage
meter program which causes communications with the
postal authority to purchase postage and updates the
contents of the secure non-volatile memory. The postage
printing program assigns a unique serial number to
every printed envelope and label, where the unique
serial number includes a meter identifier unique to that
end user. The postage printing program of the user
directly controls the printer so as to prevent end users
from printing more than one copy of any envelope or
label with the same serial number. The patent suggests
that by capturing and storing the serial numbers on all
mailpieces, and then periodically processing the
information, the postal service can detect fraudulent
duplication of envelopes or labels. In this system, funds are
accounted for by and at the mailer site. The mailer
creates and issues the unique serial number which is not
submitted to the postal service prior to mail entering the
postal service mail processing stream. Moreover, no
assistance is provided to enhance the deliverability of
the mail beyond current existing systems.

Another system not employing encryption of the
indicium is disclosed in U.S. Patent No. 5,612,889 for
MAIL PROCESSING SYSTEM WITH UNIQUE MAILPIECE
AUTHORIZATION ASSIGNED IN ADVANCE OF
MAILPIECES ENTERING CARRIER SERVICE MAIL
PROCESSING STREAM.

As can be seen from the references noted above,
various postage meter designs may include electronic
accounting systems which may be secured within a
meter housing or smart cards or other types of portable
accounting systems.

Recently, the United States Postal Service has published
proposed draft specifications for future postage
payment systems, including the Information Based
Indicium Program (IBIP) Indicium Specification dated June
13, 1996 and the Information Based Indicia Program
Postal Security Device Specification dated June 13,
1996. These specifications disclose various postage
payment techniques, including various types of
secure accounting systems that may be employed, as
for example, a single chip module, multi chip module,
and multi chip stand alone module (see, for example,
Table 4.6-1 PSD Physical Security Requirements, Page
4-4 of the Information Based Indicia Program Postal
Security Device Specification).

The use of encrypted indicia involves the use of various
verification techniques to ensure that the indicia is
valid. This may be implemented via machine reading of
the indicia and subsequent validation. Alternatively, the
encrypted indicia data may be human readable and
thereafter manually entered into a computing system for
validation. The nature of the validation process requires
the retrieval of sufficient data to execute the validation
process. A problem with validation exists, however,
when the encrypted indicia is defective such that
sufficient data necessary for the validation process cannot
be obtained either by machine or human reading. This
is a case where data available to the verifying party is
insufficient for validation of the indicium. Accordingly, a
decision must be made as to how to further process such
mail, either to reject the mail piece or to place the mail
piece in the mail delivery stream. A similar situation
exists for verifiable (non-encrypted) indicia which are
printed by various metering systems. In such systems,
the imprinted indicia is verifiable so long as certain
indicia characteristics are legible as, for example, tells
intentionally included in the indicia. In such case, the imprinted
indicia, if legible, can be compared to stored indicia
specimens for the meter system.

It has been discovered that a system can be imple- 
mented to increase the percentage of mail having an 
encrypted indicia which can be placed in the mail deliv- 
ery stream without significantly compromising revenue 
security. 

It has been discovered that certain characteristics
exist in mail having an illegible encrypted indicia
which allow a determination to be made to
process the mail for delivery, based on characteristics of
the mail piece, without compromising revenue security.

It is an object of the present invention to provide a 
mechanism for determining the acceptance or rejection 
of mail into a mail delivery stream. 

It is a further objective of the present invention to 
provide a validation system which allows for processing 
of both machine readable and non machine readable 
indicia. 

It is yet a further objective of the present invention 
to distinguish between classes of non machine readable 
indicia to allow efficient processing of the mail. 

It is still a further objective of the present invention 
to provide a means to distinguish between acceptable 
and non-acceptable substrates of various types having 
printing thereon which is illegible. 

It is yet another objective of the present invention to 
provide a process for determining whether defects in the 
printing of a substrate or mail pieces (as for example in 
the indicia) are likely to be intentionally created based 
on neural network processing of data. 

With these and other objectives in view, a method
embodying the present invention for processing
mail pieces containing data printed thereon scans a
mail piece and obtains information concerning the data
printed on the mail piece. The information is processed
to determine if the data is readable. Non readable data 
information is processed to determine if the non reada- 
ble data is due to predetermined causes of a first type or 
predetermined causes of a second type. 

In accordance with a feature of the present inven- 
tion, a substrate may be used instead of a mail piece 
and the printed information may be any type of printed 
information such as a printed indicium. The printing may 
be optical character recognizable type printing, bar code
printing of any type or other types of printing. 

In accordance with another feature of the present 
invention, mail pieces or substrates with non readable
data due to the first type of predetermined causes are 
processed in a first manner and mail pieces or sub- 
strates with non readable data due to the second type of 
predetermined causes are processed in a second man- 
ner. 

Reference is now made to the following figures 
wherein like reference numerals designate similar ele- 
ments in the various views and in which: 

FIGURE 1 is a block diagram of a mail validation 
system incorporating the present invention to 
increase the percentage of mail pieces which can 
be properly processed; 

FIGURES 2a-g are a series of depictions of various
portions of a numeric character which may be part
of an encrypted indicia, helpful in a full understanding
of the present invention;

FIGURE 3 is a diagrammatic representation of a
neural network system helpful in one form of
implementation of the present invention;

FIGURE 4 is a flow chart of the system shown in
FIGURE 1.

General Overview 

The present method allows for automatic recogni- 
tion of images which were deliberately distorted for the 
purpose of rendering them to be non readable to avoid 
detection as counterfeited. The practical significance of 
this invention lies in the fact that: 

a) it allows automatic detection and outsorting of 
mail pieces with highly probable fraudulent indicia; 

b) it raises the bar for aspiring counterfeiters in the sense
that it requires more time, knowledge and money to
artificially create non readable images which
resemble, with high fidelity, naturally occurring damaged
but legitimately printed images.

Therefore, the invention closes a potentially wide
open loophole in the postage payment system based on
digital images incorporating validation codes (digital
tokens or truncated ciphertexts), thus creating a secure
payment system trusted by mailers and posts.
In the postage payment system which is based on digital
images incorporating validation codes (digital tokens
or truncated ciphertexts), it is customarily assumed that
the verifying party (usually a Postal Administration) can
automatically capture and recognize information printed
in the digital indicium and validate the indicium
authenticity and information integrity by using an appropriate
cryptographic algorithm. The rate of error free automatic
recognition is assumed to be high due to the special
data format and error control data in the indicium with
which the postage evidencing device (franking machine,
a computer printer and the like) prints the indicium. In
the case of a reading error, that is the rejection of the
indicium as unreadable by the recognition process, it is
assumed that there is an error recovery mechanism
based on manual key entry of the information in the
indicium into the verifying computer.

This arrangement opens an opportunity for unscrupulous
mailers to test the robustness of the system by printing
images of legitimate looking digital indicia artificially
distorted to render them both human and machine
unreadable. In this case, the verifying party is left with an
unpleasant policy decision: should the mail piece be
accepted for delivery or rejected based on illegibility of
the information in the indicium. There is no logical basis
for making such a policy decision: if the indicium is
legitimate but of poor quality, then it was paid for and the
mail piece must be accepted, but there is no confidence
that it is legitimate; if the indicium is a counterfeit then it
can be rejected or investigated, but there is no confidence
that it is counterfeit. This dilemma emphasizes the need to
find a way to automatically discriminate with a high level
of confidence between legitimate and counterfeited
images of poor quality.

The point about the confidence level is important. Due to
the very large number of mail pieces processed daily, the
process of discrimination is statistical by nature. This
means that the probability of correct identification of
artificially distorted counterfeit images has to be high
enough, for example 80% or 90%. Since the majority of
mailers are honest regardless of the postal verification
policy, it can be reasonably assumed that a very large
proportion of mail items carry a legitimate proof of
payment. Thus, the majority of postage for the mail is
legitimately paid. Accordingly, only a small percentage of
the total mail stream may be counterfeits or illegitimate
copies. If some proportion of those is generated by the
artificial distortion method outlined above, a robust
discrimination process can outsort a large portion of those
for investigation, leaving a smaller number of undecidable
pieces that can be safely accepted into the postal stream
for delivery without further investigation. The monetary
loss associated with undecidable and potentially
counterfeited pieces is so small that it may not warrant any
further investigation, and the whole payment system can be
considered robust and trustworthy. This outsorting process
substantially improves the effectiveness of investigation
of non-readable indicia.
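
The statistical argument above can be illustrated with a small calculation. The following is only a sketch: the function name and every input figure (mail volume, counterfeit fraction, share of counterfeits that are artificially distorted) are hypothetical assumptions chosen for illustration, and only the 90% detection probability echoes the example given in the text.

```python
def split_distorted_counterfeits(volume, counterfeit_fraction,
                                 distorted_fraction, detection_probability):
    """Estimate how many artificially distorted counterfeits are outsorted
    for investigation and how many remain undecidable and are accepted.

    All arguments are hypothetical inputs supplied by the caller.
    """
    distorted = volume * counterfeit_fraction * distorted_fraction
    outsorted = distorted * detection_probability
    accepted = distorted - outsorted
    return outsorted, accepted


# Hypothetical figures: 1,000,000 pieces, 1% counterfeit, half of those
# artificially distorted, 90% probability of correct identification.
outsorted, accepted = split_distorted_counterfeits(1_000_000, 0.01, 0.5, 0.9)
print(f"outsorted for investigation: {outsorted:.0f}")
print(f"accepted without investigation: {accepted:.0f}")
```

Under these assumed inputs the pieces accepted without investigation are a very small share of the total stream, which is the sense in which the loss is described as small.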

The Method 

The discrimination between artificially and naturally
distorted images utilizes three principles:

1. The naturally occurring defects of the printed
indicium image are due to specific interaction between
the printing mechanism, printing media and printing
ink. Such defects are classifiable and have repeatable,
measurable and statistically stable patterns.

2. The indicium printing process and image have
been designed with special provisions such as
specially selected print font, size of characters, etc. The
indicium data contains redundancy such as error
detection and correction, as well as other redundant
data. Due to these special provisions taken to
ensure human and machine readability, these
images are readable with a high probability.

3. The statistics of naturally occurring and rare non
readable images are not available to aspiring
counterfeiters. It takes a long period of time and effort to
collect such statistics without having exposure to a
very large volume of non readable indicia. Since
vendors of franking machines in possession of such
data should treat it as sensitive, similar to the
treatment of printing dies for conventional mechanical
meters, it will not be generally publicly available.



Artificially distorted non readable images have
measurable patterns statistically different from the
patterns of naturally occurring images mentioned in the
first principle.



When an image is digitized it may be represented
as a collection of pixels, each having a color, gray scale
level or binary value with associated X and Y coordinates.
The digital image of an indicium consists of pixels
representing graphical elements and characters. The
characters crucial for indicium validation may be, in certain
systems, only numerals of certain shape, reducing the total
number of shapes to be considered for recognition
purposes from hundreds for a typical text reading application
to 10.

The following are examples of different types of
statistics:

total number of pixels in the image with the value
above a certain predetermined threshold;

number of pixels of a certain value in prespecified
positions;

average number of pixels of a certain value in each
character shape;

maximum number of pixels of a certain value in
each character shape;

minimum number of pixels of a certain value in
each character shape;

average number of pixels of a certain value in each
graphical element;

maximum number of pixels of a certain value in
each graphical element;

minimum number of pixels of a certain value in
each graphical element;

total number of pixels of a certain value in each
graphical element.
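
As an illustration of how statistics of this kind might be computed, the sketch below treats a digitized image as a list of pixel rows. The representation, the function names and the use of rectangular boxes to delimit character shapes are assumptions made for this example only; the text does not prescribe them.

```python
from statistics import mean


def count_above(image, threshold):
    """Total number of pixels whose value exceeds the threshold."""
    return sum(1 for row in image for pixel in row if pixel > threshold)


def count_value_in_box(image, box, value):
    """Number of pixels equal to `value` inside box = (top, left, bottom, right),
    where bottom and right are exclusive."""
    top, left, bottom, right = box
    return sum(1 for row in image[top:bottom]
               for pixel in row[left:right] if pixel == value)


def region_statistics(image, boxes, value):
    """Average, maximum, minimum and total pixel counts over a list of boxes
    (one box per character shape or per graphical element)."""
    counts = [count_value_in_box(image, box, value) for box in boxes]
    return {"average": mean(counts), "maximum": max(counts),
            "minimum": min(counts), "total": sum(counts)}


# Tiny binary image (1 = printed pixel) with two hypothetical character boxes.
image = [
    [1, 1, 0, 0, 1, 0],
    [1, 0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0, 1],
]
character_boxes = [(0, 0, 3, 2), (0, 4, 3, 6)]

total_printed = count_above(image, 0)
stats = region_statistics(image, character_boxes, 1)
print(total_printed, stats)
```

The same `region_statistics` helper can be applied to boxes around graphical elements to obtain the corresponding graphical-element statistics.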


Process: Designing Classifier 

1. Collect and digitize a representative sample of
human non readable images.

2. Compute image statistics (of the type described 
above). 

3. Compute statistical parameters for the statistics: 
such as mean values, correlations, dispersions, 
standard deviations. 

4. Classify the results and define a statistical pat- 
tern recognition algorithm based on the computed 
parameters (features) selected from the set of all 
computed statistical parameters based on their dis- 
criminating power. 
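
The four steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the classifier of the invention: the Fisher criterion used to measure discriminating power, the nearest-mean decision rule, the class names "natural" and "artificial", and the sample data are all choices made for this example.

```python
from statistics import mean, pvariance


def fisher_score(class_a, class_b):
    """Discriminating power of one statistic: squared distance between the
    class means divided by the sum of the class variances."""
    spread = pvariance(class_a) + pvariance(class_b)
    gap = (mean(class_a) - mean(class_b)) ** 2
    return gap / spread if spread else float("inf")


def select_features(samples_a, samples_b, keep):
    """Rank statistics by Fisher score; return the indices of the best `keep`."""
    scores = []
    for j in range(len(samples_a[0])):
        column_a = [sample[j] for sample in samples_a]
        column_b = [sample[j] for sample in samples_b]
        scores.append((fisher_score(column_a, column_b), j))
    scores.sort(reverse=True)
    return [j for _, j in scores[:keep]]


def nearest_mean_classifier(samples_a, samples_b, features):
    """Build a minimum-distance classifier on the selected features."""
    def centroid(samples):
        return [mean(s[j] for s in samples) for j in features]

    centre_a, centre_b = centroid(samples_a), centroid(samples_b)

    def classify(sample):
        point = [sample[j] for j in features]
        dist_a = sum((p - c) ** 2 for p, c in zip(point, centre_a))
        dist_b = sum((p - c) ** 2 for p, c in zip(point, centre_b))
        return "natural" if dist_a <= dist_b else "artificial"

    return classify


# Made-up statistics vectors: [total pixels, min per character, max per graphic]
natural = [[410, 12, 90], [395, 14, 88], [420, 11, 91]]
artificial = [[405, 2, 89], [400, 1, 92], [415, 3, 87]]

chosen = select_features(natural, artificial, keep=1)
classify = nearest_mean_classifier(natural, artificial, chosen)
print(chosen, classify([398, 13, 90]), classify([412, 2, 90]))
```

In this toy data only the second statistic separates the two classes, so it is the one retained as a feature.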

This last process can be implemented in a classical 
fashion, i.e. when the process of features selection is 
guided by a human designer and then one of the tradi- 
tional classifiers is employed (see for example, Hand- 
book of Pattern Recognition and Image Processing, ed. 
by T. Young and K. Fu, Academic Press, 1986). 

Alternatively, a neural network approach can be
very effective for this particular application. In this case
a three layer network can be employed. The first layer
consists of a number of input nodes equal to the
number of preselected image statistics, for example 30
for each character shape, 9 for graphic elements and 3
for total number of pixels, that is 42 input nodes. The
intermediate level may have, for example, 10 nodes (on
how to select the intermediate level, see for example R.
Hecht-Nielsen, Neural Networks, Addison-Wesley,
1991). The output layer consists of two nodes,
corresponding to human readable or human non readable.
Such a network can then be trained with supervision on
the basis of a collected sample of readable and non
readable images. In such training, the supervisor
presents the network with input data together with the
correct result (readable, non readable). The process
converges to a stable state, in which the weights assigned to
connections between nodes are stable and have
certain values. The process of training, for example, can
employ a known algorithm of back propagation of errors
(see R. Hecht-Nielsen, Neural Networks, Addison-Wesley,
1991). After training, the network is employed
to classify real images, which were not a part of the
initial training set. One interesting method of using the
network is to "interrogate" the network, upon conclusion of
the training process, as to which inputs were deciding
factors during the classification process. In practice
this means listing connection weights between the
nodes in descending order and selecting the inputs that
contributed most to these weights. Once that is done, the
selected inputs can be used as features in a
conventional statistical classifier. In such manner, the
computing resources required to classify images can be
minimized, since conventional classifiers are typically
more computationally efficient than neural networks.
The process can also be implemented without a neural
network by cataloging the various types of illegible
printed data. These categories include printed data
intentionally made illegible.
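
A network of the size mentioned above (42 input nodes, 10 intermediate nodes, 2 output nodes) trained by back propagation of errors might be sketched as below. This is an illustrative sketch only: the sigmoid activation, squared-error updates, learning rate, synthetic training data and the weight-summing rule used to "interrogate" the network are all assumptions, since the text leaves these details to the cited reference.

```python
import math
import random

random.seed(0)  # reproducible illustration

N_INPUT, N_HIDDEN, N_OUTPUT = 42, 10, 2  # layer sizes quoted in the text


def init_layer(n_in, n_out):
    """Small random weights; the last entry of each row is a bias."""
    return [[random.uniform(-0.1, 0.1) for _ in range(n_in + 1)]
            for _ in range(n_out)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def layer_output(weights, inputs):
    """Weighted sum of the inputs plus bias, passed through a sigmoid."""
    return [sigmoid(sum(w * v for w, v in zip(row, inputs)) + row[-1])
            for row in weights]


def forward(net, inputs):
    hidden = layer_output(net["hidden"], inputs)
    return hidden, layer_output(net["output"], hidden)


def train_step(net, inputs, target, rate=0.5):
    """One back propagation update for a single labelled sample."""
    hidden, output = forward(net, inputs)
    delta_out = [(o - t) * o * (1 - o) for o, t in zip(output, target)]
    delta_hid = [h * (1 - h) * sum(d * net["output"][k][j]
                                   for k, d in enumerate(delta_out))
                 for j, h in enumerate(hidden)]
    for k, d in enumerate(delta_out):
        row = net["output"][k]
        for j, h in enumerate(hidden):
            row[j] -= rate * d * h
        row[-1] -= rate * d
    for j, d in enumerate(delta_hid):
        row = net["hidden"][j]
        for i, v in enumerate(inputs):
            row[i] -= rate * d * v
        row[-1] -= rate * d


def rank_inputs(net):
    """Order inputs by total absolute weight into the intermediate layer
    (one simple reading of 'interrogating' the trained network)."""
    strength = [sum(abs(row[i]) for row in net["hidden"])
                for i in range(N_INPUT)]
    return sorted(range(N_INPUT), key=lambda i: strength[i], reverse=True)


net = {"hidden": init_layer(N_INPUT, N_HIDDEN),
       "output": init_layer(N_HIDDEN, N_OUTPUT)}

# Synthetic training set in which only input 0 determines the label.
samples = []
for _ in range(100):
    x = [random.random() for _ in range(N_INPUT)]
    label = [1.0, 0.0] if x[0] > 0.5 else [0.0, 1.0]  # [readable, non readable]
    samples.append((x, label))

for _ in range(20):  # supervised training passes over the sample
    for x, label in samples:
        train_step(net, x, label)

ranking = rank_inputs(net)
print("most influential inputs:", ranking[:5])
```

The leading entries of `ranking` are the candidate inputs that could then be carried over as features into a conventional statistical classifier.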
Target system and process

Once a classifier has been designed and imple- 
mented, it can be employed in the image validation sys- 
tem. 

System Organization And Operation 

Reference is now made to FIGURE 1. A series of
mail pieces shown generally at 102 are placed on a mail
transport 104. The mail pieces contain an indicia having
a validation code. This has been termed an encrypted 
indicia. The encrypted indicia may contain digital tokens 
used in the validation process. Indicium data must be 
recovered to verify the proof of payment imprinted on 
the mail piece. The data necessary to do this is depend- 
ent on the form and architecture of the cryptographic 
process utilized. Encrypted and non-encrypted informa- 
tion needs to be recovered to initiate most validation 
processes. The mail pieces 102 are transported past a 
scanner 106 by mail transport 104. The scanner scans
necessary information from the mail piece to enable the 
validation process to proceed and for other purposes in 
connection with the mail processes. In one embodi- 
ment, the scanner may capture and digitize the image of 
the indicium for subsequent processing. 

If the information recovered by the scanner 106 is 
inadequate for computer recognition unit 108 to process 
the data, the captured digitized image may be sent to a 
key entry unit 110 where a determination has been 
made that the captured image is likely to be human 
readable. 

If the captured digitised image is sent to a key entry 
unit 110, the mail piece involved may be held in the 
buffer station 111 while the key entry process is
implemented. In either event, where the computer recognition
unit 108 has sufficient information or where the mail
piece is sent to the key entry unit and sufficient
information is recovered, the data is sent to a cryptographic
validation processor unit 112. The processor unit 112
determines, based on the available data from the mail
piece, whether the printed indicia is valid. After this
process has been completed, the mail pieces proceed,
either along the transport or from the buffer station, to a
sorting station 114 to be sorted based on the
determination made by the cryptographic validation processor
unit 112 to either a first sortation bin 116 for accepted
mail which will be put into the mail delivery stream or to 
sortation bin 118 where the cryptographic process has 
indicated that the mail piece has an invalid imprint. In 
such an event, this is a cryptographic indication of an 
invalid mail piece which is a fraudulent mail piece in that 
the data recovered from the mail piece is internally 
inconsistent. 

A third category of mail is still present in the mail
stream. This is mail where the mail piece data is neither
machine recognizable nor human readable. This
mail is processed to be sorted by mail sorting station
114 into either first sortation bin 116 of accepted mail or
into a third sortation bin 120 for mail requiring further
investigation. This mail bin 120 is reserved for mail
pieces which are likely fraudulent but require further
investigation because of the inconclusive nature of the
recovered data.

It is expected in general that the number of pieces
where the indicia is illegible will be relatively small and
the mail processing system as described herein further
reduces the number of mail pieces sorted into sortation
bin 120 by allowing mail pieces that are likely not
fraudulent to be accepted.
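
The sorting decisions described for FIGURE 1 can be summarized in a short sketch. The reference numerals (108, 110, 112, 116, 118, 120) come from the description; the function name, its parameters and the stub validator are hypothetical and stand in for the recognition, key entry, cryptographic validation and classification units.

```python
from enum import Enum


class Bin(Enum):
    ACCEPTED = 116      # first sortation bin: enters the mail delivery stream
    INVALID = 118       # cryptographic validation indicated an invalid imprint
    INVESTIGATE = 120   # unreadable and likely fraudulent


def route(machine_data, key_entry_data, is_valid, likely_fraudulent):
    """Choose a sortation bin for one mail piece.

    machine_data:      indicium data from recognition unit 108, or None
    key_entry_data:    data typed in at key entry unit 110, or None
    is_valid:          callable standing in for validation processor 112
    likely_fraudulent: classifier verdict for an unreadable image
    """
    data = machine_data if machine_data is not None else key_entry_data
    if data is not None:
        return Bin.ACCEPTED if is_valid(data) else Bin.INVALID
    # Neither machine nor human could read the indicium.
    return Bin.INVESTIGATE if likely_fraudulent else Bin.ACCEPTED


def stub_validator(data):
    """Placeholder only: a real system would run the cryptographic check."""
    return data.endswith("#valid")


machine_read = route("token#valid", None, stub_validator, False)
key_entered = route(None, "token#bad", stub_validator, False)
suspicious = route(None, None, stub_validator, True)
benign = route(None, None, stub_validator, False)
print(machine_read, key_entered, suspicious, benign)
```

Only pieces that are unreadable by both machine and human reach the classifier verdict, which is what keeps the number of pieces sent to bin 120 small.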

Reference is now made to FIGURE 2. It should be
expressly recognized that various encrypted data
including alpha numeric and graphical representations,
such as bar code, may be employed in the present
invention. The following description is merely for the
purpose of illustrating but one of many examples of how
the present process may be implemented.

FIGURE 2a depicts an image of the numeral 5
which is shown at 202 as a completely formed defect
free numeral. That is, all of the graphical elements
necessary to fully represent the numeral are present.
FIGURE 2b depicts the same numeral "5," however, a
portion of the image is missing. Specifically, the top
most right hand portion shown at area 204 is not
present. This means the upper right most portion of the
image contains no imprinted pixels (no black dots or
markings for the portion of the image).

Reference is now made to FIGURE 2c. The
numeral "5" now has an additional area 206 missing
from the numeral "5."

Should the validation system in FIGURE 1 recover
an image of a numeral such as shown in FIGURE 2c, for
the particular numeral type set being utilized, three
possibilities might exist. The recovered numeral intended to
be printed could be a "3" as shown at 208, could be the
original numeral "5" as shown at 202 or might be the
numeral "6" as shown at 210. Based on the recovered
information of elements in FIGURE 2c, any of the
possibilities shown in FIGURE 2d are potentially plausible.

Further information may be eliminated from the 
originally imprinted numeral "5" as shown in FIGURE 2a 
causing further difficulties. 

At FIGURE 2e, the numeral "5" has a further area
212 missing from the imprint. However, as shown in
FIGURE 2f, yet further information can be eliminated
from the imprint, specifically the area 214.

At this point, four possibilities are now plausible.
The four possibilities are shown in FIGURE 2g.

The originally imprinted numeral "5" with the pixel
elements missing as shown in FIGURE 2f makes it
plausible that the intended imprinted number could have
been a "3" as shown at 208, a "5" as shown at 202, a "6"
as shown at 210 and now, additionally, an "8" as shown
at 216.

Reference is now made to FIGURE 3. A standard 
neural network system is employed to determine the 
characteristics of human readable and non human
readable indicia. This is done through an iterative proc- 
ess of learning through a supervisor guided learning 
process. In such a process human intervention is 
included to provide the right identification (human read- 
able or human non readable) for the network based on 
the input indicia for the data set involved. 

The training of the neural network is partially
dependent upon having a set, predetermined number of
parameters which do not vary. For example, the
processing of the neural network to determine machine
readability or non-readability and human readability or
non-readability is based on particular printing equipment
and a particular scanner. The variables include the
interaction of the inks with large varieties of papers;
however, since the other variables are stable, an iterative
neural network learning process can be implemented to
improve the decision making process of accepting
and rejecting mail pieces. This makes the universe of
different factors which could impact the decision more
limited and therefore manageable.

It should be recognized that the relevant image sta- 
tistics and the weights in the network obtained as a 
result of the neural network training process depend on the
particular scanner involved and the digitization process 
and the particular indicium printing equipment 
employed. Therefore it may be necessary to retrain the 
neural network where these or other relevant factors 
change. 

The data set to the input layer nodes 1-n shown
generally at 302 may include, for example, the following
data concerning an indicia. These may be input at 302
via the various input layer nodes 1-n and may be
comprised of the following:

1. The total number of pixels in the image with a 
value above a certain predetermined threshold. 
That is, if the pixels have different intensity levels 
(gray scale values) the various pixels above a cer- 
tain predetermined threshold level can be counted. 

2. The number of pixels in the indicium of a certain
value in pre-specified positions.

3. The average number of pixels of a certain value 
in each character shape. 

4. The maximum number of pixels of a certain value 
in each character shape. 

5. The minimum number of pixels of a certain value 
in each character shape. 

6. The average number of pixels of a certain value 
in each graphical element, that is, the pixel values 
in the graphical as opposed to character element of 
the indicium. 

7. The maximum number of pixels of a certain value
in each graphical element.

8. The minimum number of pixels of a certain value in
each graphical element.

9. The total number of pixels of a certain value in 
each graphical element. 



It should be expressly recognized that this list of
input data to the input layer nodes of the neural network
system can be greatly expanded and/or be different
from those selected for the purpose of the following
example.
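
The listed inputs can be sketched as a small feature extractor. This is an illustration only: it assumes the image is a list of rows of gray scale values, that "a certain value" means "above a threshold", and that the character shapes and graphical elements have already been segmented into bounding boxes. The function names and the box format are hypothetical, item 2 (pre-specified positions) is omitted, and item 9 is interpreted as the sum over all graphical elements.

```python
def count_above(image, threshold, box=None):
    """Count pixels whose gray value exceeds threshold.

    box is (top, left, bottom, right) with bottom/right exclusive;
    when omitted, the whole image is counted (item 1).
    """
    if box is None:
        box = (0, 0, len(image), len(image[0]))
    top, left, bottom, right = box
    return sum(
        1
        for row in image[top:bottom]
        for value in row[left:right]
        if value > threshold
    )


def region_stats(image, threshold, boxes):
    """Average, maximum, minimum and total pixel counts over a set of regions."""
    counts = [count_above(image, threshold, b) for b in boxes]
    if not counts:
        return {"avg": 0.0, "max": 0, "min": 0, "total": 0}
    return {
        "avg": sum(counts) / len(counts),
        "max": max(counts),
        "min": min(counts),
        "total": sum(counts),
    }


def feature_vector(image, threshold, char_boxes, graphic_boxes):
    """Assemble items 1 and 3 to 9 of the list into one input vector."""
    chars = region_stats(image, threshold, char_boxes)
    graphics = region_stats(image, threshold, graphic_boxes)
    return [
        count_above(image, threshold),                      # item 1
        chars["avg"], chars["max"], chars["min"],           # items 3 to 5
        graphics["avg"], graphics["max"], graphics["min"],  # items 6 to 8
        graphics["total"],                                  # item 9
    ]
```

Each element of the returned vector would feed one input layer node at 302.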

The neural network system includes an intermediate
layer shown generally at 304. The intermediate layer
computes a sum of the inputs times the weights. This
is, in turn, passed to an output layer shown generally
at 306 to ultimately formulate the characteristics of
human readable and human nonreadable indicia. It
should, of course, be recognized that there could be any
number of intermediate layers. The neural network may
operate, for example, as described in the text Neural
Networks by R. Hecht-Nielsen identified above. In the
following example of the neural network, it should be
recognized that each layer is connected to the preceding
layer and the subsequent layer in the network. In that
connection, each node is connected to other nodes in
the preceding or succeeding layer, and the connection
between the nodes is defined by a weight associated
with this connection, as is shown in FIGURE 3.
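
The weighted-sum computation can be sketched as a forward pass through one intermediate layer and one output layer. The text only states that a layer sums its inputs times the weights; the bias terms and the sigmoid activation below are assumptions added for illustration, and the function names are hypothetical.

```python
import math


def layer(inputs, weights, biases):
    """Compute one layer: each node sums inputs times connection weights.

    weights[j][i] is the weight on the connection from input i to node j.
    The bias and sigmoid squashing are assumptions, not taken from the text.
    """
    outputs = []
    for node_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(node_weights, inputs)) + bias
        outputs.append(1.0 / (1.0 + math.exp(-total)))
    return outputs


def forward(inputs, hidden_w, hidden_b, output_w, output_b):
    """Input nodes (302) -> intermediate layer (304) -> output layer (306)."""
    hidden = layer(inputs, hidden_w, hidden_b)
    return layer(hidden, output_w, output_b)
```

Additional intermediate layers would simply be further calls to `layer` between the two shown.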

Reference is now made to FIGURE 4. A mail piece
is scanned and a digitized image of the indicium
obtained at 402. The recovered image is subjected to a
machine recognition process at 404. A determination is
made at 406 if the indicium is machine readable. If the
indicium is machine readable, the data is sent to a
process at 408. A determination is made at 410 if the
processed indicium is valid. If it is valid, the mail piece is
accepted at 412. The mail piece is then placed in the
mail delivery stream. If the indicium is determined as
not valid, the mail piece is rejected at 414.

For an indicium determined as not being machine
readable, statistics of the indicium are computed at 416.
These statistics are subjected to neural network or
statistical classifier processing at 418. A determination is
made at 420 whether the indicium is likely to be human
readable. If so, that is, if the likelihood of the indicium
being readable is high, the indicium data image is sent
for key entry at 422. The key entered indicium data is
thereafter processed at 408 and the process continues as
previously noted.

Where the indicium is not likely to be human readable,
a determination is made at 424 whether the image
defects are likely to have been created artificially. If the
image defects are determined not to be artificial, the
mail piece is accepted at 412. If, on the other hand, the
image defects are determined likely to be artificial at
424, the mail piece is rejected and subject to further
investigation at 426. These mail pieces are subject to
further investigation to determine whether fraud or other
improper activities have been involved in creating the
indicium.
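
The flow of FIGURE 4 can be summarized as a routing function. This is a sketch only: the boolean arguments stand in for the results of the recognition, validation and classifier steps, and the names `Outcome` and `route_mail_piece` are hypothetical. The reference numerals in the comments follow the description above.

```python
from enum import Enum


class Outcome(Enum):
    ACCEPT = "accepted (412)"
    REJECT = "rejected (414)"
    INVESTIGATE = "rejected, further investigation (426)"


def route_mail_piece(machine_readable, valid_after_decode=False,
                     likely_human_readable=False, valid_after_key_entry=False,
                     defects_likely_artificial=False):
    """Route one mail piece according to the decisions of FIGURE 4."""
    if machine_readable:                       # decision 406
        # processed at 408, validity decided at 410
        return Outcome.ACCEPT if valid_after_decode else Outcome.REJECT
    # statistics at 416 and classifier at 418 produce the flags below
    if likely_human_readable:                  # decision 420
        # key entry at 422, then the same processing at 408 and 410
        return Outcome.ACCEPT if valid_after_key_entry else Outcome.REJECT
    if defects_likely_artificial:              # decision 424
        return Outcome.INVESTIGATE
    return Outcome.ACCEPT                      # natural defect, accepted at 412
```

For example, `route_mail_piece(False, defects_likely_artificial=True)` yields `Outcome.INVESTIGATE`, the branch reserved for suspected deliberate distortion.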

It should be clearly recognized that the decisions
explained above regarding expected readability of the
indicium image are, of course, statistical ones. In other
words, the neural or traditional classifier will return a
yes/no/do not know decision with a certain confidence
level. The normal process of accepting or rejecting the
decision based on confidence level is then employed,
using a threshold level predetermined by policy decision.
If the confidence level is below the threshold
level, the mail piece can be diverted for manual
inspection. As a result of such inspection, if the image is
deemed to be human nonreadable, the mail piece can
either be accepted or rejected depending on revenue
protection policy. More specifically, the determination
made in decision box 406 is deterministic. Either the
indicium is machine readable or it is not machine
readable. On the other hand, the decisions made in decision
boxes 420 and 424 may be statistically determined.
Alternatively, these determinations may be made as a result
of review and classification of various non-machine
readable indicia. The level of these determinations, that
is, the threshold for the yes/no decision, may be
formulated by policy considerations as to revenue
protection and the level of confidence required to allow
mail to be accepted at block 412.
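
The confidence-based handling can be sketched as follows. The string labels, the 0..1 confidence scale and the function name `apply_policy` are assumptions made for illustration; the threshold itself would be set by policy, as the text notes.

```python
def apply_policy(decision, confidence, threshold):
    """Accept the classifier's decision only when it is confident enough.

    decision:   "yes", "no" or "unknown" (the yes/no/do not know result)
    confidence: classifier confidence, assumed to lie in 0..1
    threshold:  policy-defined minimum confidence
    """
    if decision == "unknown" or confidence < threshold:
        return "manual inspection"
    return decision
```

Raising the threshold diverts more mail pieces to manual inspection, trading throughput for revenue protection.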

It should be recognized that the method and system
described above are applicable to other coding systems,
including all forms of bar code. In the case of bar codes,
the indicium includes several types of redundancy. The
geometric structure of the bar code allows locating
particular code words. This structure includes a target to
help the scanner locate and determine the size and
format of the bar code, and a specific lattice structure of
the image. Each code word within the bar code includes
redundant data, possibly linked to the location of the
code word within the symbol. The bar code usually also
includes substantial error detection and correction
code. The data included in the bar code is itself
redundant; for example, the date contains redundant data
and the postal origin is determined by the meter number
through a meter database. The mail piece and indicium
may contain human readable and OCR readable data that
is also included in the bar code. The verification system
can check the consistency of this human readable data
with partial data from the bar code.

The verification system can employ the redundancies
noted above to detect deliberately fraudulent non
readable indicia, as well as to help partially decode
symbols not readable with a standard decode algorithm.
For example, PDF417 has three distinct clusters of code
words, and substantial structure within a code word.
The three clusters are used sequentially in separate
rows. The verification system can check that code
words are consistent with their rows.
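
The row consistency check can be sketched as follows. The sketch relies on the published PDF417 symbology rather than on anything stated in this description: each symbol character has four bars and four spaces totalling 17 modules, its cluster number is (b1 - b2 + b3 - b4 + 9) mod 9 over the four bar widths, and rows use clusters 0, 3 and 6 in rotation. The function names and the input format (measured element widths per code word, rows indexed from 0) are hypothetical.

```python
def cluster_number(widths):
    """Cluster of one symbol character from its 8 element widths.

    widths alternates bar, space, bar, space, ... starting with a bar.
    """
    if len(widths) != 8 or sum(widths) != 17:
        raise ValueError("a PDF417 symbol character has 8 elements and 17 modules")
    b1, b2, b3, b4 = widths[0], widths[2], widths[4], widths[6]
    return (b1 - b2 + b3 - b4 + 9) % 9


def expected_cluster(row):
    """Rows 0, 1, 2, 3, ... use clusters 0, 3, 6, 0, ... in sequence."""
    return (row % 3) * 3


def inconsistent_code_words(rows):
    """Return (row, column) positions whose cluster does not match the row."""
    bad = []
    for r, row in enumerate(rows):
        for c, widths in enumerate(row):
            if cluster_number(widths) != expected_cluster(r):
                bad.append((r, c))
    return bad
```

A few isolated mismatches would be expected from natural print defects; how many mismatches suggest deliberate tampering is a policy matter that this sketch does not address.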

An attacker may smear the bar code. A naturally
occurring smear is unlikely, in a well designed system,
to hide all the information and redundancy. The
verification system can still detect inconsistencies in
the image.

An attacker may alternatively omit printing part of
an image, imitating nozzle blockage in an ink jet printer
or printing over a thickness variation with a thermal
transfer printer. Naturally occurring faults of this type are
unlikely to completely obliterate the indicium
information, so again in this case, the redundancy can be
detected.

While the present invention has been disclosed and
described with reference to the specific embodiments
described herein, it will be apparent, as noted above
and from the foregoing description itself, that variations
and modifications may be made therein. It is, thus,
intended in the following claims to cover each variation
and modification that falls within the true spirit and
scope of the present invention.

Claims 

1. A method for processing mail pieces containing
data printed thereon, comprising the steps of:

a. scanning a mail piece and obtaining information
concerning said data printed on said mail piece;

b. processing said information to determine if
said data is readable; and

c. processing non readable data information to
determine if said non readable data is due to
predetermined causes of a first type or
predetermined causes of a second type.

2. A method as defined in CLAIM 1 wherein the data
printed on said mail piece is an indicium.

3. A method as defined in CLAIM 2 comprising the further
steps of processing mail pieces with non readable
indicium due to predetermined causes of said
first type by entering said mail pieces into a mail
delivery system and processing mail pieces with
non readable indicium due to predetermined
causes of said second type in a second manner.

4. A method as defined in CLAIM 1 or CLAIM 2 further
comprising the steps of:

processing mail pieces with non readable data
due to predetermined causes of said first type
in a first manner and processing mail pieces
with non readable data due to predetermined
causes of said second type in a second manner.

5. A method as defined in CLAIM 1 or CLAIM 2
wherein said non readable data is non-machine
readable data.

6. A method as defined in CLAIM 5 wherein said non
readable data is non-machine readable bar code
data.

7. A method as defined in CLAIM 6 wherein said non
readable data is non-machine readable PDF417
type bar code data.

8. A method as defined in CLAIM 5 wherein said non
readable data is non-machine readable optical
character recognizable type data.

9. A method as defined in CLAIM 1 or CLAIM 2
wherein said non readable data is non human
readable data.

10. A system for processing mail pieces, each having
an indicium printed thereon, comprising:

means for scanning mail piece indicium;

a computer recognition unit coupled to said
scanner means for processing output data from
said scanner;

a crypto validation processor means coupled to
said computer recognition means for processing
data from said computer recognition means
to determine whether the scanned data from a
mail piece is valid; and,

sortation means coupled to said computer
recognition means for sorting said mail into
accepted mail pieces, rejected mail pieces and
mail pieces subject to further investigation.

11. A system as defined in CLAIM 10 further comprising:

key entry means connected to said computer
recognition means and said crypto validation
processor means for key entry of data which is
not computer recognizable to said crypto
validation processor unit.

12. A method for processing mail comprising:

a. scanning a mail piece and obtaining a
digitized image of an indicium;

b. applying a machine recognition process to
the digitized image;

c. determining whether the digitized image is
machine readable;

d. processing machine readable indicia
through a cryptographic validation process;
and,

e. processing non machine readable indicia
through a process to determine whether the
image defects are likely to have been
intentionally created.